[CS598] Wav2Sleep Classification Model Contribution#959
Open
Hannah877 wants to merge 3 commits intosunlabuiuc:masterfrom
Conversation
Contributor
Name: Yihan Zhang
NetID/Email: yihan20 yihan20@illinois.edu
Type of contribution & Original Paper
Model contribution
Original paper: https://arxiv.org/abs/2411.04644
High-level description
Wav2Sleep is a multi-modal architecture for automated sleep staging from synchronized raw physiological signals (e.g., ECG and respiration).
Key Implementation Details:
• Temporal Encoders: Specialized CNN-based feature extractors handle input signals with different sampling frequencies.
• Transformer Backbone: Implements a global Transformer block to capture long-range dependencies between sleep epochs.
• Stochastic Masking: Includes a robust fusion mechanism that handles missing modalities during training, as described in the original paper.
• Custom Output Handling: Implements a sequence-aware output head that bypasses standard label-processing limitations in PyHealth, ensuring compatibility with sequence prediction tasks.
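The stochastic masking idea above can be sketched framework-free: during training, each input modality is randomly dropped so the fusion layer learns to cope with missing channels. This is a minimal illustration, not the PR's actual code; the function name, modality names, and keep probability are all assumptions.

```python
import random

def mask_modalities(signals: dict, keep_prob: float = 0.5, rng=None):
    """Randomly drop input modalities, always keeping at least one.

    `signals` maps modality names (e.g. 'ecg', 'resp') to their raw
    signal arrays. `keep_prob` is the chance each modality survives;
    both names and the default value are illustrative.
    """
    rng = rng or random.Random()
    kept = {name: sig for name, sig in signals.items()
            if rng.random() < keep_prob}
    if not kept:
        # Guarantee at least one modality survives so the forward
        # pass always has something to fuse.
        name = rng.choice(sorted(signals))
        kept = {name: signals[name]}
    return kept
```

At inference time the same fusion path can then accept whatever subset of modalities a recording actually provides.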
File guide listing which files to review
• pyhealth/models/wav2sleep.py: Core model architecture and logic.
• pyhealth/models/__init__.py: Registration of Wav2Sleep.
• tests/core/test_wav2sleep.py: Unit tests covering instantiation, forward pass, output shapes, and gradient computation.
• examples/sleep_staging_wav2sleep.py: Comprehensive ablation study on synthetic data.
• docs/api/models/pyhealth.models.wav2sleep.rst: API documentation.
• docs/api/models.rst: Added the Wav2Sleep entry to the models table.
Ablation Study Summary
I evaluated the model's sensitivity to two key hyperparameters using synthetic sleep staging data (5-stage classification). The experimental setup follows the data structure of the SHHS dataset supported by PyHealth, but uses synthetic tensors for fast reproducibility.
1. Embedding Dimension Ablation: tests the impact of latent-space capacity.
2. Transformer Layers Ablation: tests the impact of architectural depth.
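The synthetic setup described above could look roughly like the sketch below: per-epoch raw signals at modality-specific sampling rates plus a 5-stage label sequence. The sampling rates, epoch count, and function name are illustrative assumptions, not the PR's exact settings.

```python
import random

N_STAGES = 5            # Wake, N1, N2, N3, REM
EPOCHS_PER_NIGHT = 120  # number of 30-second sleep epochs (illustrative)

def make_synthetic_night(ecg_hz=64, resp_hz=8, rng=None):
    """Generate one synthetic night of multi-modal sleep data.

    Returns a dict of per-epoch raw signals (Gaussian noise standing in
    for real waveforms) and a per-epoch stage-label sequence. Sampling
    rates are assumptions; real SHHS recordings differ.
    """
    rng = rng or random.Random()
    ecg = [[rng.gauss(0, 1) for _ in range(ecg_hz * 30)]
           for _ in range(EPOCHS_PER_NIGHT)]
    resp = [[rng.gauss(0, 1) for _ in range(resp_hz * 30)]
            for _ in range(EPOCHS_PER_NIGHT)]
    labels = [rng.randrange(N_STAGES) for _ in range(EPOCHS_PER_NIGHT)]
    return {"ecg": ecg, "resp": resp}, labels
```

Sweeping the embedding dimension and transformer depth then amounts to training on nights generated this way and comparing validation metrics per configuration.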
Findings & Conclusion
Optimal Capacity: The model performs best with a 128-dimensional embedding. Increasing to 256 leads to overfitting on small-scale synthetic data.
Depth Efficiency: A 2-layer Transformer backbone is sufficient for capturing temporal dependencies in these signals; additional layers do not yield significant gains.